


Kernel Regression in Structured Non-IID Settings: Theory and Implications for Denoising Score Learning

Zhang, Dechen, Shi, Zhenmei, Zhang, Yi, Liang, Yingyu, Zou, Difan

arXiv.org Machine Learning

Kernel ridge regression (KRR) is a foundational tool in machine learning, with recent work emphasizing its connections to neural networks. However, existing theory primarily addresses the i.i.d. setting, while real-world data often exhibits structured dependencies - particularly in applications like denoising score learning where multiple noisy observations derive from shared underlying signals. We present the first systematic study of KRR generalization for non-i.i.d. data with signal-noise causal structure, where observations represent different noisy views of common signals. By developing a novel blockwise decomposition method that enables precise concentration analysis for dependent data, we derive excess risk bounds for KRR that explicitly depend on: (1) the kernel spectrum, (2) causal structure parameters, and (3) sampling mechanisms (including relative sample sizes for signals and noises). We further apply our results to denoising score learning, establishing generalization guarantees and providing principled guidance for sampling noisy data points. This work advances KRR theory while providing practical tools for analyzing dependent data in modern machine learning applications.
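
As a quick reference for the estimator under study, here is a minimal sketch of plain kernel ridge regression in the i.i.d. setting (a standard RBF-kernel implementation, not the paper's non-i.i.d. analysis; the kernel choice, bandwidth, and regularization values are illustrative assumptions):

```python
import numpy as np

def rbf_kernel(X, Z, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-||x_i - z_j||^2 / (2 h^2))."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def krr_fit(X, y, lam=1e-3, bandwidth=0.3):
    """Solve (K + n * lam * I) alpha = y for the dual coefficients."""
    n = X.shape[0]
    K = rbf_kernel(X, X, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(X_train, alpha, X_test, bandwidth=0.3):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x) at the test points."""
    return rbf_kernel(X_test, X_train, bandwidth) @ alpha

# Toy usage: noisy observations of a shared underlying signal.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)
alpha = krr_fit(X, y)
X_test = np.linspace(-1.0, 1.0, 5)[:, None]
print(krr_predict(X, alpha, X_test))
```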



On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Neural Information Processing Systems

The widely observed 'benign overfitting' phenomenon in the neural network literature challenges the 'bias-variance trade-off' doctrine of statistical learning theory. Since the generalization ability of the 'lazy trained' over-parametrized neural network can be well approximated by that of the neural tangent kernel regression, the learning curve (i.e., the curve of the excess risk) of kernel ridge regression has attracted increasing attention.


On the Saturation Effects of Spectral Algorithms in Large Dimensions

Lu, Weihao, Zhang, Haobo, Li, Yicheng, Lin, Qian

arXiv.org Machine Learning

The saturation effects, which originally refer to the fact that kernel ridge regression (KRR) fails to achieve the information-theoretical lower bound when the regression function is over-smooth, have been observed for almost 20 years and were rigorously proved recently for kernel ridge regression and some other spectral algorithms over a fixed dimensional domain. The main focus of this paper is to explore the saturation effects for a large class of spectral algorithms (including KRR, gradient descent, etc.) in large dimensional settings where the sample size $n$ and dimension $d$ satisfy $n \asymp d^{\gamma}$. More precisely, we first propose an improved minimax lower bound for the kernel regression problem in large dimensional settings and show that gradient flow with an early stopping strategy results in an estimator achieving this lower bound (up to a logarithmic factor). Similar to the results in KRR, we can further determine the exact convergence rates (both upper and lower bounds) of a large class of (optimally tuned) spectral algorithms with different qualifications $\tau$. In particular, we find that these exact rate curves (varying along $\gamma$) exhibit periodic plateau behavior and a polynomial approximation barrier. Consequently, we can fully depict the saturation effects of the spectral algorithms and reveal a new phenomenon in large dimensional settings: the saturation effect occurs in the large dimensional setting as long as the source condition $s>\tau$, while it occurs in the fixed dimensional setting only when $s>2\tau$.
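
The qualification $\tau$ can be made concrete through the standard filter-function view of spectral algorithms. The sketch below uses textbook filter definitions (not code from the paper): the KRR filter has qualification $\tau = 1$, while the gradient-flow filter has arbitrarily high qualification, which is why early-stopped gradient flow can escape saturation:

```python
import numpy as np

# Standard spectral filter functions; the qualification tau bounds how fast
# the residual 1 - mu * phi(mu) can decay as the regularization is tuned.
def krr_filter(mu, lam):
    """Kernel ridge regression: phi_lam(mu) = 1 / (mu + lam), qualification tau = 1."""
    return 1.0 / (mu + lam)

def gradient_flow_filter(mu, t):
    """Gradient flow stopped at time t: phi_t(mu) = (1 - exp(-t * mu)) / mu,
    arbitrarily high qualification."""
    return (1.0 - np.exp(-t * mu)) / mu

# Residual r(mu) = 1 - mu * phi(mu): the fraction of a signal component with
# eigenvalue mu that the estimator fails to fit. KRR's residual decays only
# linearly in mu, while gradient flow's decays exponentially.
mu = np.logspace(-4, 0, 5)
lam, t = 1e-2, 1e2  # roughly matched regularization strengths (t ~ 1/lam)
print("KRR residual:          ", 1.0 - mu * krr_filter(mu, lam))
print("gradient-flow residual:", 1.0 - mu * gradient_flow_filter(mu, t))
```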


Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay

Li, Yicheng, Gan, Weiye, Shi, Zuoqiang, Lin, Qian

arXiv.org Artificial Intelligence

The neural tangent kernel (NTK) theory (Jacot et al., 2018), which shows that gradient kernel regression well approximates the over-parametrized neural network trained by gradient descent (Jacot et al., 2018; Allen-Zhu et al., 2019; Lee et al., 2019), gives us a natural surrogate for understanding the generalization behavior of neural networks in certain circumstances. This surrogate has led to a recent renaissance in the study of kernel methods. For example, one may ask whether overfitting can harm generalization (Bartlett et al., 2020), how the smoothness of the underlying regression function affects the generalization error (Li et al., 2023), or whether one can determine a lower bound on the generalization error for a specific function. All these questions can be answered by the generalization error curve, which aims to determine the exact generalization error of a given kernel regression method with respect to the kernel, the regression function, the noise level, and the choice of the regularization parameter. Such a generalization error curve provides a comprehensive picture of the generalization ability of the corresponding kernel regression method (Bordelon et al., 2020; Cui et al., 2021; Li et al., 2023).
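
To make the notion of a generalization error curve concrete, the snippet below evaluates the classical fixed-design bias-variance proxy for KRR in a power-law sequence model (a simplified heuristic in the spirit of Bordelon et al., 2020 and Cui et al., 2021, not the exact curves derived in the paper; the decay exponents, noise level, and the $\lambda = 1/n$ tuning rule are illustrative assumptions):

```python
import numpy as np

def krr_risk_proxy(n, lam, beta=2.0, s=2.0, sigma2=0.25, M=100_000):
    """Fixed-design bias^2 + variance proxy for KRR under power-law decay.

    Kernel eigenvalues lambda_i = i^{-beta}, target coefficients f_i = i^{-s},
    noise variance sigma2, ridge parameter lam, truncation level M.
    """
    i = np.arange(1, M + 1, dtype=float)
    lam_i = i ** (-beta)          # kernel spectrum
    f_i = i ** (-s)               # regression-function coefficients
    bias2 = np.sum((lam / (lam_i + lam)) ** 2 * f_i ** 2)
    variance = sigma2 / n * np.sum((lam_i / (lam_i + lam)) ** 2)
    return bias2 + variance

# Trace out a learning curve by sweeping the sample size with lam = 1/n.
for n in [10**2, 10**3, 10**4, 10**5]:
    print(f"n = {n:>6d}   risk proxy = {krr_risk_proxy(n, lam=1.0 / n):.3e}")
```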


Distributed Gradient Descent for Functional Learning

Yu, Zhan, Fan, Jun, Zhou, Ding-Xuan

arXiv.org Artificial Intelligence

In recent years, distributed learning schemes have received increasing attention for their strong advantages in handling large-scale data. To face the big data challenges that have recently emerged in functional data analysis, we propose a novel distributed gradient descent functional learning (DGDFL) algorithm to tackle functional data across numerous local machines (processors) in the framework of reproducing kernel Hilbert spaces. Based on integral operator approaches, we provide the first theoretical understanding of the DGDFL algorithm in many different aspects. As a first step toward understanding DGDFL, we propose and comprehensively study a data-based gradient descent functional learning (GDFL) algorithm associated with a single-machine model. Under mild conditions, we obtain confidence-based optimal learning rates for DGDFL without the saturation boundary on the regularity index suffered in previous works on functional regression. We further provide a semi-supervised DGDFL approach that weakens the restriction on the maximal number of local machines needed to ensure optimal rates. To the best of our knowledge, DGDFL provides the first distributed iterative training approach to functional learning and enriches the study of functional data analysis.
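
As a minimal illustration of the divide-and-conquer principle behind distributed schemes such as DGDFL, the sketch below trains local estimators by gradient descent on disjoint data shards and averages them; a plain linear least-squares model stands in for the paper's RKHS functional estimator, and all names and constants are illustrative assumptions:

```python
import numpy as np

def local_gradient_descent(X, y, steps=200, lr=0.1):
    """Gradient descent on the local least-squares objective
    (1 / 2n) * ||X w - y||^2; stands in for a local GDFL-style estimator."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * (X.T @ (X @ w - y) / n)
    return w

def distributed_gd(X, y, num_machines=4, steps=200, lr=0.1):
    """Divide-and-conquer: train on disjoint shards, then average the local
    estimators -- the communication pattern used by distributed learning."""
    local_ws = [local_gradient_descent(Xs, ys, steps, lr)
                for Xs, ys in zip(np.array_split(X, num_machines),
                                  np.array_split(y, num_machines))]
    return np.mean(local_ws, axis=0)

# Toy usage: think of each row as a functional covariate discretized on a grid.
rng = np.random.default_rng(1)
X = rng.standard_normal((800, 20))
w_true = rng.standard_normal(20)
y = X @ w_true + 0.1 * rng.standard_normal(800)
w_hat = distributed_gd(X, y)
print("estimation error:", np.linalg.norm(w_hat - w_true))
```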